Los datos que se analizan proceden de la compilación de gente en Kaggle. La fecha del análisis empieza el 19 de Abril de 2020, utilizándo la versión 84 del web anterior.
# 1- Ejecutar la tabla del COVID con Python
import pandas as pd
datos=pd.read_csv("covid_19_clean_complete.csv")
datos.head(10)
## Province/State Country/Region ... Deaths Recovered
## 0 NaN Afghanistan ... 0 0
## 1 NaN Albania ... 0 0
## 2 NaN Algeria ... 0 0
## 3 NaN Andorra ... 0 0
## 4 NaN Angola ... 0 0
## 5 NaN Antigua and Barbuda ... 0 0
## 6 NaN Argentina ... 0 0
## 7 NaN Armenia ... 0 0
## 8 Australian Capital Territory Australia ... 0 0
## 9 New South Wales Australia ... 0 0
##
## [10 rows x 8 columns]
# 2- Lo mismo que arriba pero cargando python desde R
pd <- import("pandas") #cargamos la libreria pandas de python
datos <- pd$read_csv("covid_19_clean_complete.csv")
kable(head(datos,10)) #kable de la libreria knitr inicializada al principio, para maquear tablas
| Province/State | Country/Region | Lat | Long | Date | Confirmed | Deaths | Recovered |
|---|---|---|---|---|---|---|---|
| NaN | Afghanistan | 33.0000 | 65.0000 | 1/22/20 | 0 | 0 | 0 |
| NaN | Albania | 41.1533 | 20.1683 | 1/22/20 | 0 | 0 | 0 |
| NaN | Algeria | 28.0339 | 1.6596 | 1/22/20 | 0 | 0 | 0 |
| NaN | Andorra | 42.5063 | 1.5218 | 1/22/20 | 0 | 0 | 0 |
| NaN | Angola | -11.2027 | 17.8739 | 1/22/20 | 0 | 0 | 0 |
| NaN | Antigua and Barbuda | 17.0608 | -61.7964 | 1/22/20 | 0 | 0 | 0 |
| NaN | Argentina | -38.4161 | -63.6167 | 1/22/20 | 0 | 0 | 0 |
| NaN | Armenia | 40.0691 | 45.0382 | 1/22/20 | 0 | 0 | 0 |
| Australian Capital Territory | Australia | -35.4735 | 149.0124 | 1/22/20 | 0 | 0 | 0 |
| New South Wales | Australia | -33.8688 | 151.2093 | 1/22/20 | 0 | 0 | 0 |
# 3- Lo mismo que arriba pero con R directamente
datos <- read.csv("covid_19_clean_complete.csv")
kable(head(datos,10))
| Province.State | Country.Region | Lat | Long | Date | Confirmed | Deaths | Recovered |
|---|---|---|---|---|---|---|---|
| Afghanistan | 33.0000 | 65.0000 | 1/22/20 | 0 | 0 | 0 | |
| Albania | 41.1533 | 20.1683 | 1/22/20 | 0 | 0 | 0 | |
| Algeria | 28.0339 | 1.6596 | 1/22/20 | 0 | 0 | 0 | |
| Andorra | 42.5063 | 1.5218 | 1/22/20 | 0 | 0 | 0 | |
| Angola | -11.2027 | 17.8739 | 1/22/20 | 0 | 0 | 0 | |
| Antigua and Barbuda | 17.0608 | -61.7964 | 1/22/20 | 0 | 0 | 0 | |
| Argentina | -38.4161 | -63.6167 | 1/22/20 | 0 | 0 | 0 | |
| Armenia | 40.0691 | 45.0382 | 1/22/20 | 0 | 0 | 0 | |
| Australian Capital Territory | Australia | -35.4735 | 149.0124 | 1/22/20 | 0 | 0 | 0 |
| New South Wales | Australia | -33.8688 | 151.2093 | 1/22/20 | 0 | 0 | 0 |
# 4- Con R directamente usando la libreria Tidyverse poniendo los strings como string y no factores(por defecto)
datos <- read.csv("covid_19_clean_complete.csv", stringsAsFactors = F)
datos %>% head(10) %>% kable()
| Province.State | Country.Region | Lat | Long | Date | Confirmed | Deaths | Recovered |
|---|---|---|---|---|---|---|---|
| Afghanistan | 33.0000 | 65.0000 | 1/22/20 | 0 | 0 | 0 | |
| Albania | 41.1533 | 20.1683 | 1/22/20 | 0 | 0 | 0 | |
| Algeria | 28.0339 | 1.6596 | 1/22/20 | 0 | 0 | 0 | |
| Andorra | 42.5063 | 1.5218 | 1/22/20 | 0 | 0 | 0 | |
| Angola | -11.2027 | 17.8739 | 1/22/20 | 0 | 0 | 0 | |
| Antigua and Barbuda | 17.0608 | -61.7964 | 1/22/20 | 0 | 0 | 0 | |
| Argentina | -38.4161 | -63.6167 | 1/22/20 | 0 | 0 | 0 | |
| Armenia | 40.0691 | 45.0382 | 1/22/20 | 0 | 0 | 0 | |
| Australian Capital Territory | Australia | -35.4735 | 149.0124 | 1/22/20 | 0 | 0 | 0 |
| New South Wales | Australia | -33.8688 | 151.2093 | 1/22/20 | 0 | 0 | 0 |
str(datos)
## 'data.frame': 23580 obs. of 8 variables:
## $ Province.State: chr "" "" "" "" ...
## $ Country.Region: chr "Afghanistan" "Albania" "Algeria" "Andorra" ...
## $ Lat : num 33 41.2 28 42.5 -11.2 ...
## $ Long : num 65 20.17 1.66 1.52 17.87 ...
## $ Date : chr "1/22/20" "1/22/20" "1/22/20" "1/22/20" ...
## $ Confirmed : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Deaths : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Recovered : int 0 0 0 0 0 0 0 0 0 0 ...
#ponemos el nombre de las variables en castellano
colnames(datos) = c("Provincia_Estado", #variable cualitativa
"Pais_Region", #variable cualitativa
"Latitud", # N+ o S- , variable cuantitativo
"Longitud", #E+ o W- , variable cuantitativo
"Fecha", #aparece como factor
"Casos_Confirmados",
"Casos_Muertos",
"Casos_Recuperados"
)
datos %>% head() %>% kable()
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados |
|---|---|---|---|---|---|---|---|
| Afghanistan | 33.0000 | 65.0000 | 1/22/20 | 0 | 0 | 0 | |
| Albania | 41.1533 | 20.1683 | 1/22/20 | 0 | 0 | 0 | |
| Algeria | 28.0339 | 1.6596 | 1/22/20 | 0 | 0 | 0 | |
| Andorra | 42.5063 | 1.5218 | 1/22/20 | 0 | 0 | 0 | |
| Angola | -11.2027 | 17.8739 | 1/22/20 | 0 | 0 | 0 | |
| Antigua and Barbuda | 17.0608 | -61.7964 | 1/22/20 | 0 | 0 | 0 |
Las variables cualitativas(cualidades sin orden) se convierten con factor o bien as.factor (ej. ojos azules, marrones, negros,…)
Las variables ordinales, son como las cualitativas pero tienen un orden (suspenso<aprobado<notable<sobresaliente), se convierten con ordered.
Las variables cuantitativas se convierten con as.numeric.
Provincia_Estado y Pais_Region serían variables cualitativas.
Latitud y Longitud serían cuantitativas.
Fecha aparece como factor que habría que pasar a variable ordinal.
Casos_Confirmados, Casos_Muertos y Casos_Recuperdados son cuantitativas.
#Reocnvertimos los strings en factores (esto se hace para ver cómoo se haría)
datos$Provincia_Estado %<>% factor() # %<>% para que fluya en ambas direcciones (libreria magrittr)
datos$Pais_Region %<>% factor()
# datos$Fecha %<>% as.Date(format="%m/%d/%y") #para pasar a formato fecha
datos$Fecha %<>% mdy() #Pasar a Fecha usando la libreria lubridate
str(datos)
## 'data.frame': 23580 obs. of 8 variables:
## $ Provincia_Estado : Factor w/ 81 levels "","Alberta","Anguilla",..: 1 1 1 1 1 1 1 1 6 49 ...
## $ Pais_Region : Factor w/ 185 levels "Afghanistan",..: 1 2 3 4 5 6 7 8 9 9 ...
## $ Latitud : num 33 41.2 28 42.5 -11.2 ...
## $ Longitud : num 65 20.17 1.66 1.52 17.87 ...
## $ Fecha : Date, format: "2020-01-22" "2020-01-22" ...
## $ Casos_Confirmados: int 0 0 0 0 0 0 0 0 0 0 ...
## $ Casos_Muertos : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Casos_Recuperados: int 0 0 0 0 0 0 0 0 0 0 ...
\[Casos\ Confirmados = Muertos + Recuperados + Enfermos \]
#Ahora con 'mutate' de Tidyverse calculamos la columna y la añadimos a datos
datos %<>%
mutate(Casos_Enfermos = Casos_Confirmados - Casos_Muertos - Casos_Recuperados)
datos %>%
filter(Casos_Confirmados > 10000) %>% #filtramos con esa condicion
head() %>% #mostramos los primeros
kable() #presentar la tabla bonita
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados | Casos_Enfermos |
|---|---|---|---|---|---|---|---|---|
| Hubei | China | 30.9756 | 112.2707 | 2020-02-02 | 11177 | 350 | 295 | 10532 |
| Hubei | China | 30.9756 | 112.2707 | 2020-02-03 | 13522 | 414 | 386 | 12722 |
| Hubei | China | 30.9756 | 112.2707 | 2020-02-04 | 16678 | 479 | 522 | 15677 |
| Hubei | China | 30.9756 | 112.2707 | 2020-02-05 | 19665 | 549 | 633 | 18483 |
| Hubei | China | 30.9756 | 112.2707 | 2020-02-06 | 22112 | 618 | 817 | 20677 |
| Hubei | China | 30.9756 | 112.2707 | 2020-02-07 | 24953 | 699 | 1115 | 23139 |
datos %>% #Para ver si hay enfermos negativos, lo que indicaría mala toma de datos
filter(Casos_Enfermos <0) %>%
arrange(Provincia_Estado, Fecha) %>% # Oara ordenar las filas del datset
kable()
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados | Casos_Enfermos |
|---|---|---|---|---|---|---|---|---|
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-22 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-23 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-24 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-25 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-26 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-27 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-28 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-29 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-30 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-03-31 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-01 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-02 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-03 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-04 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-05 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-06 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-07 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-08 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-09 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-10 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-11 | 0 | 1 | 0 | -1 |
| Diamond Princess | Canada | 0.0000 | 0.0000 | 2020-04-12 | -1 | 1 | 0 | -2 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-24 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-25 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-26 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-27 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-28 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-29 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-30 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-31 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-01 | 168 | 6 | 168 | -6 |
#Vemos que en Hainan y en un sitio de Canada hay algo raro
datos %>%
filter(Provincia_Estado=="Hainan") %>%
kable()
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados | Casos_Enfermos |
|---|---|---|---|---|---|---|---|---|
| Hainan | China | 19.1959 | 109.7453 | 2020-01-22 | 4 | 0 | 0 | 4 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-23 | 5 | 0 | 0 | 5 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-24 | 8 | 0 | 0 | 8 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-25 | 19 | 0 | 0 | 19 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-26 | 22 | 0 | 0 | 22 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-27 | 33 | 1 | 0 | 32 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-28 | 40 | 1 | 0 | 39 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-29 | 43 | 1 | 0 | 42 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-30 | 46 | 1 | 1 | 44 |
| Hainan | China | 19.1959 | 109.7453 | 2020-01-31 | 52 | 1 | 1 | 50 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-01 | 62 | 1 | 1 | 60 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-02 | 64 | 1 | 4 | 59 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-03 | 72 | 1 | 4 | 67 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-04 | 80 | 1 | 5 | 74 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-05 | 99 | 1 | 5 | 93 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-06 | 106 | 1 | 8 | 97 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-07 | 117 | 2 | 10 | 105 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-08 | 124 | 2 | 14 | 108 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-09 | 131 | 3 | 19 | 109 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-10 | 138 | 3 | 19 | 116 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-11 | 144 | 3 | 20 | 121 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-12 | 157 | 4 | 27 | 126 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-13 | 157 | 4 | 30 | 123 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-14 | 159 | 4 | 43 | 112 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-15 | 162 | 4 | 39 | 119 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-16 | 162 | 4 | 52 | 106 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-17 | 163 | 4 | 59 | 100 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-18 | 163 | 4 | 79 | 80 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-19 | 168 | 4 | 84 | 80 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-20 | 168 | 4 | 86 | 78 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-21 | 168 | 4 | 95 | 69 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-22 | 168 | 4 | 104 | 60 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-23 | 168 | 5 | 106 | 57 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-24 | 168 | 5 | 116 | 47 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-25 | 168 | 5 | 124 | 39 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-26 | 168 | 5 | 129 | 34 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-27 | 168 | 5 | 131 | 32 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-28 | 168 | 5 | 133 | 30 |
| Hainan | China | 19.1959 | 109.7453 | 2020-02-29 | 168 | 5 | 148 | 15 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-01 | 168 | 5 | 149 | 14 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-02 | 168 | 5 | 151 | 12 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-03 | 168 | 5 | 155 | 8 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-04 | 168 | 5 | 158 | 5 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-05 | 168 | 6 | 158 | 4 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-06 | 168 | 6 | 158 | 4 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-07 | 168 | 6 | 158 | 4 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-08 | 168 | 6 | 159 | 3 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-09 | 168 | 6 | 159 | 3 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-10 | 168 | 6 | 159 | 3 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-11 | 168 | 6 | 159 | 3 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-12 | 168 | 6 | 160 | 2 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-13 | 168 | 6 | 160 | 2 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-14 | 168 | 6 | 160 | 2 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-15 | 168 | 6 | 160 | 2 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-16 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-17 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-18 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-19 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-20 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-21 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-22 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-23 | 168 | 6 | 161 | 1 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-24 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-25 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-26 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-27 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-28 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-29 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-30 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-03-31 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-01 | 168 | 6 | 168 | -6 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-02 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-03 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-04 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-05 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-06 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-07 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-08 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-09 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-10 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-11 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-12 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-13 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-14 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-15 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-16 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-17 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-18 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-19 | 168 | 6 | 162 | 0 |
| Hainan | China | 19.1959 | 109.7453 | 2020-04-20 | 168 | 6 | 162 | 0 |
#Corregimos el error
datos %>%
filter(Provincia_Estado=="Hainan", Casos_Enfermos<0) %>%
mutate(Casos_Recuperados= Casos_Recuperados + Casos_Enfermos,
Casos_Enfermos = 0)
## Provincia_Estado Pais_Region Latitud Longitud Fecha Casos_Confirmados
## 1 Hainan China 19.1959 109.7453 2020-03-24 168
## 2 Hainan China 19.1959 109.7453 2020-03-25 168
## 3 Hainan China 19.1959 109.7453 2020-03-26 168
## 4 Hainan China 19.1959 109.7453 2020-03-27 168
## 5 Hainan China 19.1959 109.7453 2020-03-28 168
## 6 Hainan China 19.1959 109.7453 2020-03-29 168
## 7 Hainan China 19.1959 109.7453 2020-03-30 168
## 8 Hainan China 19.1959 109.7453 2020-03-31 168
## 9 Hainan China 19.1959 109.7453 2020-04-01 168
## Casos_Muertos Casos_Recuperados Casos_Enfermos
## 1 6 162 0
## 2 6 162 0
## 3 6 162 0
## 4 6 162 0
## 5 6 162 0
## 6 6 162 0
## 7 6 162 0
## 8 6 162 0
## 9 6 162 0
"
datos_europa = datos[datos$Latitud >38 &
datos$Longitud >-25 &
datos$Longitud < 30
, ]
"
## [1] "\ndatos_europa = datos[datos$Latitud >38 & \n datos$Longitud >-25 &\n datos$Longitud < 30 \n , ]\n"
#Esto es lo mismo que abajo, pero mejor puesto
datos_europa = datos %>%
filter(Latitud > 38, between(Longitud, -25, 30))
nrow(datos_europa)
## [1] 4050
table(datos_europa$Pais_Region) %>%
as.data.frame() %>% #pasamos la 'table' a 'dataframe' para poder aplicar 'filter'
filter(Freq > 0) %>%
kable()
| Var1 | Freq |
|---|---|
| Albania | 90 |
| Andorra | 90 |
| Austria | 90 |
| Belarus | 90 |
| Belgium | 90 |
| Bosnia and Herzegovina | 90 |
| Bulgaria | 90 |
| Croatia | 90 |
| Czechia | 90 |
| Denmark | 180 |
| Estonia | 90 |
| Finland | 90 |
| France | 90 |
| Germany | 90 |
| Greece | 90 |
| Holy See | 90 |
| Hungary | 90 |
| Iceland | 90 |
| Ireland | 90 |
| Italy | 90 |
| Kosovo | 90 |
| Latvia | 90 |
| Liechtenstein | 90 |
| Lithuania | 90 |
| Luxembourg | 90 |
| Moldova | 90 |
| Monaco | 90 |
| Montenegro | 90 |
| Netherlands | 90 |
| North Macedonia | 90 |
| Norway | 90 |
| Poland | 90 |
| Portugal | 90 |
| Romania | 90 |
| San Marino | 90 |
| Serbia | 90 |
| Slovakia | 90 |
| Slovenia | 90 |
| Spain | 90 |
| Sweden | 90 |
| Switzerland | 90 |
| United Kingdom | 270 |
#Para ver como estaba Europa el 15 de Marzo
datos_europa %>%
filter(Fecha == ymd("2020-03-15")) %>%
kable()
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados | Casos_Enfermos |
|---|---|---|---|---|---|---|---|---|
| Albania | 41.15330 | 20.16830 | 2020-03-15 | 42 | 1 | 0 | 41 | |
| Andorra | 42.50630 | 1.52180 | 2020-03-15 | 1 | 0 | 1 | 0 | |
| Austria | 47.51620 | 14.55010 | 2020-03-15 | 860 | 1 | 6 | 853 | |
| Belarus | 53.70980 | 27.95340 | 2020-03-15 | 27 | 0 | 3 | 24 | |
| Belgium | 50.83330 | 4.00000 | 2020-03-15 | 886 | 4 | 1 | 881 | |
| Bosnia and Herzegovina | 43.91590 | 17.67910 | 2020-03-15 | 24 | 0 | 0 | 24 | |
| Bulgaria | 42.73390 | 25.48580 | 2020-03-15 | 51 | 2 | 0 | 49 | |
| Croatia | 45.10000 | 15.20000 | 2020-03-15 | 49 | 0 | 1 | 48 | |
| Czechia | 49.81750 | 15.47300 | 2020-03-15 | 253 | 0 | 0 | 253 | |
| Faroe Islands | Denmark | 61.89260 | -6.91180 | 2020-03-15 | 11 | 0 | 0 | 11 |
| Denmark | 56.26390 | 9.50180 | 2020-03-15 | 864 | 2 | 1 | 861 | |
| Estonia | 58.59530 | 25.01360 | 2020-03-15 | 171 | 0 | 1 | 170 | |
| Finland | 64.00000 | 26.00000 | 2020-03-15 | 244 | 0 | 10 | 234 | |
| France | 46.22760 | 2.21370 | 2020-03-15 | 4499 | 91 | 12 | 4396 | |
| Germany | 51.00000 | 9.00000 | 2020-03-15 | 5795 | 11 | 46 | 5738 | |
| Greece | 39.07420 | 21.82430 | 2020-03-15 | 331 | 4 | 8 | 319 | |
| Holy See | 41.90290 | 12.45340 | 2020-03-15 | 1 | 0 | 0 | 1 | |
| Hungary | 47.16250 | 19.50330 | 2020-03-15 | 32 | 1 | 1 | 30 | |
| Iceland | 64.96310 | -19.02080 | 2020-03-15 | 171 | 5 | 8 | 158 | |
| Ireland | 53.14240 | -7.69210 | 2020-03-15 | 129 | 2 | 0 | 127 | |
| Italy | 43.00000 | 12.00000 | 2020-03-15 | 24747 | 1809 | 2335 | 20603 | |
| Latvia | 56.87960 | 24.60320 | 2020-03-15 | 30 | 0 | 1 | 29 | |
| Liechtenstein | 47.14000 | 9.55000 | 2020-03-15 | 4 | 0 | 0 | 4 | |
| Lithuania | 55.16940 | 23.88130 | 2020-03-15 | 12 | 0 | 1 | 11 | |
| Luxembourg | 49.81530 | 6.12960 | 2020-03-15 | 59 | 1 | 0 | 58 | |
| Moldova | 47.41160 | 28.36990 | 2020-03-15 | 23 | 0 | 0 | 23 | |
| Monaco | 43.73330 | 7.41670 | 2020-03-15 | 2 | 0 | 0 | 2 | |
| Montenegro | 42.50000 | 19.30000 | 2020-03-15 | 0 | 0 | 0 | 0 | |
| Netherlands | 52.13260 | 5.29130 | 2020-03-15 | 1135 | 20 | 2 | 1113 | |
| North Macedonia | 41.60860 | 21.74530 | 2020-03-15 | 14 | 0 | 1 | 13 | |
| Norway | 60.47200 | 8.46890 | 2020-03-15 | 1221 | 3 | 1 | 1217 | |
| Poland | 51.91940 | 19.14510 | 2020-03-15 | 119 | 3 | 0 | 116 | |
| Portugal | 39.39990 | -8.22450 | 2020-03-15 | 245 | 0 | 2 | 243 | |
| Romania | 45.94320 | 24.96680 | 2020-03-15 | 131 | 0 | 9 | 122 | |
| San Marino | 43.94240 | 12.45780 | 2020-03-15 | 101 | 5 | 4 | 92 | |
| Serbia | 44.01650 | 21.00590 | 2020-03-15 | 48 | 0 | 0 | 48 | |
| Slovakia | 48.66900 | 19.69900 | 2020-03-15 | 54 | 0 | 0 | 54 | |
| Slovenia | 46.15120 | 14.99550 | 2020-03-15 | 219 | 1 | 0 | 218 | |
| Spain | 40.00000 | -4.00000 | 2020-03-15 | 7798 | 289 | 517 | 6992 | |
| Sweden | 63.00000 | 16.00000 | 2020-03-15 | 1022 | 3 | 1 | 1018 | |
| Switzerland | 46.81820 | 8.22750 | 2020-03-15 | 2200 | 14 | 4 | 2182 | |
| Channel Islands | United Kingdom | 49.37230 | -2.36440 | 2020-03-15 | 3 | 0 | 0 | 3 |
| Isle of Man | United Kingdom | 54.23610 | -4.54810 | 2020-03-15 | 0 | 0 | 0 | 0 |
| United Kingdom | 55.37810 | -3.43600 | 2020-03-15 | 1140 | 21 | 18 | 1101 | |
| Kosovo | 42.60264 | 20.90298 | 2020-03-15 | 0 | 0 | 0 | 0 |
\[d(x,y) = \sqrt{(x_{Lat}-y_{Lat})^2 + (x_{Long}-y_{Long})^2}\]
distancia_grados = function (x, y){
sqrt((x[1]-y[1])^2 + (x[2]-y[2])^2)
}
distancia_grados_zgz =function(x){
zgz = c(41.65167,-0.9650208) #coordenadas zgz
distancia_grados(x, zgz) #para que calcule la distancia de cada lugar a zgz
}
dist_zgz = apply(cbind(datos_europa$Latitud, datos_europa$Longitud), #unimos ambas columnas
MARGIN=1, #por filas =1
FUN= distancia_grados_zgz) #que aplique esta funcion
datos_europa %<>% #añadimos una columna a datos_europa
mutate(dist_zgz = dist_zgz)
#vamos a ver si tenemos casos cerca (aunque las coordenadas vienen todas en MAdrid, por ejemplo)
datos_europa %>%
filter(between(Fecha, dmy("15-3-2020"), dmy("17-3-2020")), #entre estas fechas
dist_zgz < 3) %>% #a 3 grados de distancia
kable()
| Provincia_Estado | Pais_Region | Latitud | Longitud | Fecha | Casos_Confirmados | Casos_Muertos | Casos_Recuperados | Casos_Enfermos | dist_zgz |
|---|---|---|---|---|---|---|---|---|---|
| Andorra | 42.5063 | 1.5218 | 2020-03-15 | 1 | 0 | 1 | 0 | 2.629576 | |
| Andorra | 42.5063 | 1.5218 | 2020-03-16 | 2 | 0 | 1 | 1 | 2.629576 | |
| Andorra | 42.5063 | 1.5218 | 2020-03-17 | 39 | 0 | 1 | 38 | 2.629576 |
world <- ne_countries(scale = "medium", returnclass = "sf")
datos$Pais_Region = factor(datos$Pais_Region, levels= c(levels(datos$Pais_Region), "United States"))
datos[datos$Pais_Region=="US",]$Pais_Region = "United States" #cambiamos US por United States
#cruzamos la tabla world con datos
world %>%
inner_join(datos, by=c("name" = "Pais_Region")) %>% #cruzamos la columna 'name' de la tabla world con la columna 'Pais_Region' de la tabla datos
filter(Fecha == dmy("15-03-2020")) %>% #filtramos
ggplot() +
geom_sf(color= "black", aes(fill= Casos_Confirmados)) +
# coord_sf(crs="+proj=laea + lat_0=50 + lon_0=10 + units=m + ellps=GRS80") + #para poner en modo perspectiva(ojo de pez) centrado en europa
scale_fill_viridis_c(option="plasma", trans = "sqrt") +
xlab("Longitud") + ylab("Latitud") +
ggtitle("Mapa del Mundo", subtitle = "COVID-19") -> gg
## Warning: Column `name`/`Pais_Region` joining character vector and factor,
## coercing into character vector
ggplotly(gg) #para poder hace zoom